Internet Archive crawldata from Aaron Swartz Crawl, captured by crawl345.us.archive.org:aaronswartz from Tue Jan 15 14:31:50 PST 2013 to Tue Jan 15 07:05:01 PST 2013.
This item is part of the collection: Away from Keyboard: Aaron H. Swartz
Identifier: AS-20130115143150-crawl345
Contributor: Internet Archive
Crawljob: aaronswartz
Creator: Internet Archive
Date: 2013
Firstfiledate: 20130115143238
Firstfileserial: 01018
Identifier-access: http://www.archive.org/details/AS-20130115143150-crawl345
Lastdate: 20130115070501
Lastfiledate: 20130115150346
Lastfileserial: 01027
Mediatype: web
Numwarcs: 10
Operator: lekash@archive.org
Scandate: 20130115143238
Scanner: crawl345.us.archive.org
Scanningcenter: sanfrancisco
Sizehint: 10563599234
Sponsor: Internet Archive
Publicdate: 2013-01-15 21:01:49
Addeddate: 2013-01-15 21:01:49
Imagecount: 228765
Keywords: crawldata
| Information | Format | Size |
| AS-20130115143150-crawl345_files.xml | Metadata | [file] |
| AS-20130115143150-crawl345_meta.xml | Metadata | 1.4 KB |
| Other Files | Web ARChive GZ | WARC CDX Index | Item CDX Index | Item CDX Meta-Index | Text |
| AS-20130115143150-01018.warc.gz | 960.8 MB | 1.2 MB | | | |
| AS-20130115143150-crawl345.cdx.gz | | | 12.7 MB | | |
| AS-20130115143150-crawl345.cdx.idx | | | | 8.2 KB | |
| AS-20130115143446-01019.warc.gz | 963.5 MB | 2.3 MB | | | |
| AS-20130115143805-01020.warc.gz | 953.7 MB | 2.5 MB | | | |
| AS-20130115144224-01021.warc.gz | 985.2 MB | 1.3 MB | | | |
| AS-20130115144542-01022.warc.gz | 962.1 MB | 1.4 MB | | | |
| AS-20130115144829-01023.warc.gz | 958.6 MB | 1.1 MB | | | |
| AS-20130115145130-01024.warc.gz | 959.2 MB | 1.1 MB | | | |
| AS-20130115145433-01025.warc.gz | 1.4 GB | 482.2 KB | | | |
| AS-20130115145821-01026.warc.gz | 960.2 MB | 988.8 KB | | | |
| AS-20130115150214-01027.warc.gz | 953.7 MB | 1.2 MB | | | |
| MANIFEST.txt | | | | | 660.0 B |